EFFECT OF DICHOTOMIZING CONTINUOUS VARIABLES IN REGRESSION MODELS
Author: Jose Francisco Cumsille
Abstract
Jose Francisco Cumsille. Effect of Dichotomizing Continuous Variables in Regression Models (Under the Direction of Dr. Shrikant I. Bangdiwala). Dichotomizing continuous variables is a very common practice in data analysis. Some of its consequences are well known: loss of information, placing dissimilar individuals in the same group, loss of statistical power, and underestimation of the correlation coefficient, among others. Dichotomization is not only a methodological issue; it can also strongly affect the interpretation of empirical results. The regression model is one of the most popular tools for analyzing data, and it is very common to see results from a linear regression model in which one or more continuous variables have been dichotomized. The objectives of this dissertation are to study the effect of categorizing continuous variables on the model structure in multiple linear regression models, and to study the effect on the exposure (main effect) of categorizing a continuous confounding variable in both multiple linear and multiple logistic regression models. Accordingly, the outcome can be continuous or binary, and the confounding variable must be continuous. We study the situation in which the main effect is also a continuous variable. An analytic approach was used to evaluate the consequences for the model structure, whereas a simulation approach was used to evaluate the impact on the measure of association between the outcome and the exposure, for both the linear and logistic regression situations. The effect on the measure of association was evaluated using a variety of methodologies.
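The simulation idea described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's actual design: the sample size, the true coefficients, the correlation `rho` between exposure and confounder, the median cut point, and the function name `simulate_exposure_estimates` are all assumptions chosen for illustration. The sketch generates a continuous exposure `x` and a correlated continuous confounder `z`, fits a linear model for the outcome twice (once with `z` kept continuous, once with `z` split at its median), and compares the average estimated exposure coefficient.

```python
import numpy as np


def simulate_exposure_estimates(n=2000, n_reps=200, beta_x=1.0, beta_z=1.0,
                                rho=0.6, seed=0):
    """Average exposure coefficient with a continuous vs. dichotomized confounder.

    All parameter values are illustrative assumptions, not taken from the source.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])  # exposure and confounder are correlated
    est_continuous = np.empty(n_reps)
    est_dichotomized = np.empty(n_reps)

    for r in range(n_reps):
        xz = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        x, z = xz[:, 0], xz[:, 1]
        y = beta_x * x + beta_z * z + rng.normal(size=n)  # true linear model

        z_bin = (z > np.median(z)).astype(float)  # median split of the confounder
        intercept = np.ones(n)

        design_cont = np.column_stack([intercept, x, z])
        design_dich = np.column_stack([intercept, x, z_bin])

        # Ordinary least squares; index 1 is the exposure coefficient.
        est_continuous[r] = np.linalg.lstsq(design_cont, y, rcond=None)[0][1]
        est_dichotomized[r] = np.linalg.lstsq(design_dich, y, rcond=None)[0][1]

    return est_continuous.mean(), est_dichotomized.mean()


if __name__ == "__main__":
    mean_cont, mean_dich = simulate_exposure_estimates()
    print(f"true exposure effect:            1.000")
    print(f"confounder kept continuous:      {mean_cont:.3f}")
    print(f"confounder dichotomized (median): {mean_dich:.3f}")
```

Under these assumed settings, the estimate with the continuous confounder stays close to the true exposure effect, while the median-split version is pushed noticeably upward, because the binary indicator removes only part of the confounding. Changing `rho` or `beta_z` shows how the size of this residual confounding depends on the strength of the confounder.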
Similar resources
Dichotomizing continuous predictors in multiple regression: a bad idea.
In medical research, continuous variables are often converted into categorical variables by grouping values into two or more categories. We consider in detail issues pertaining to creating just two groups, a common approach in clinical research. We argue that the simplicity achieved is gained at a cost; dichotomization may create rather than avoid problems, notably a considerable loss of power ...
Profile monitoring with a nominal multi-category response
In certain statistical process control applications, the quality of a process or product can be characterized by a function relating a response variable to one or more independent variables. This function is commonly referred to as a profile. The response variable can be continuous or discrete. Existing research assumes that the response variable is continuous, whereas some of the potential applications of...
Using latent variables in the logistic regression model to remove the effect of multicollinearity in the analysis of some factors related to breast cancer
Background and Objectives: Logistic regression is one of the most widely used generalized linear models for analyzing the relationships between one or more explanatory variables and a categorical response. Strong correlations among explanatory variables (multicollinearity) reduce the efficiency of the model to a considerable degree. In this study we used latent variables to reduce the effects of ...
Let Continuous Outcome Variables Remain Continuous
The complementary log-log model is an alternative to the logistic model. In many areas of research, the outcome data are continuous. We aim to provide a procedure that allows the researcher to estimate the coefficients of the complementary log-log model without dichotomizing and without loss of information. We show that the sample size required for a specific power of the proposed approach is substantial...
Test for linearity between continuous confounder and binary outcome first, run a multivariate regression analysis second
Previous statistical studies have indicated that dichotomizing a continuous confounding variable in multivariate regression analyses can lead to biased estimation of the effect of exposures, treatments, and risk factors on outcomes. We suggest that, prior to entry in the multivariate analysis, one should test whether or not the continuous confounding variable is linearly related to the log-odds of ...